Notes 11/11:
* figure out dcast issues
* replacing NA/0s within pipe

1 Data overview:

  • Fish follows (2 min) conducted at sites in Antigua (6), Barbuda (3) and Bonaire (4) from March-August 2017
  • Follows tracked time spent grazing, bite rates, and competitive interactions (among scarids and with damselfish)
  • Follows targeted Sparisoma viride, Scarus vetula, and Sparisoma aurofrenatum of both initial and terminal phases
  • Sparisoma aurofrenatum were not followed in Bonaire, but were added in Antigua and Barbuda because of low abundances of the other two species. Some sites in Antigua and Barbuda had little or no representation of terminal phase viride and vetula
  • Most eventual analyses will likely focus exclusively on initial phase (standardized size window) S. viride and S. vetula
  • Site-level factors (benthic, fish, and rugosity) were assessed at each of the 13 sites

Need to determine an appropriate size range for comparison. Because fish sizes are not evenly distributed across islands (i.e. larger fish in Bonaire, smaller in Barbuda), I will likely want to compare length-feeding relationships rather than pooled averages.
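
A minimal sketch of what comparing length-feeding relationships could look like, as opposed to pooled island means. The data frame `follows` and all values in it are simulated for illustration; only the column names `fr`, `length_cm`, and `island` follow these notes, and the interaction model is one possible choice rather than a settled approach.

```r
set.seed(1)

# Hypothetical data: fish sizes differ by island (larger in Bonaire, smaller in Barbuda)
follows <- data.frame(
  island = factor(rep(c("Antigua", "Barbuda", "Bonaire"), each = 30))
)
island_mean_len <- c(Antigua = 22, Barbuda = 17, Bonaire = 28)  # made-up means
follows$length_cm <- rnorm(90, mean = island_mean_len[as.character(follows$island)], sd = 4)
follows$fr <- 1200 - 15 * follows$length_cm + rnorm(90, sd = 150)

# Pooled island means ignore the size differences between islands
pooled <- tapply(follows$fr, follows$island, mean)

# Length-feeding relationship with island-specific intercepts and slopes
fit_len <- lm(fr ~ length_cm * island, data = follows)

# Model without island, for a nested comparison
fit_null <- lm(fr ~ length_cm, data = follows)

print(pooled)
print(anova(fit_null, fit_len))
```

If the nested comparison shows little improvement from adding island, apparent island differences in the pooled means may mostly reflect the sizes of fish sampled.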

2 Examining variable distributions and relationships

2.1 Predictor variables

Potential predictor variables are site-level fish, benthic, and rugosity values. These are likely correlated with one another, and I need to determine which ones to use if I model behavioral responses with multivariate regressions. I can also move to structural equation modeling (SEM) if I want to keep multiple correlated predictors.

First, check distribution of predictor variables of interest: not very normally distributed…

Variable selection notes:
- excluding both carnivore variables, as they are highly correlated with scarid biomass and total biomass. Eventually I could make these more nuanced by distinguishing actual predators, but right now I don’t think they reflect actual populations of predators of >15cm parrotfish
- rugosity is highly correlated with turf cover and scarid density
- scarid density: removing for now, because I think it was a bit skewed by Barbuda juveniles
- could eventually use conspecific scarid length as another indicator of overfishing?
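
A rough sketch of how the correlation screening behind these notes could be done. The table `site_vars`, its simulated values, and the 0.7 cutoff are all assumptions for illustration; Spearman correlation is used here only because the predictors do not look normally distributed.

```r
set.seed(2)

# Hypothetical site-level table: 13 sites, values simulated for illustration
n_sites <- 13
scar_bm <- rnorm(n_sites, mean = 1500, sd = 400)
site_vars <- data.frame(
  scar_bm  = scar_bm,
  carn_bm  = 0.5 * scar_bm + rnorm(n_sites, sd = 50),  # built to correlate with scar_bm
  rugosity = rnorm(n_sites, mean = 1.5, sd = 0.2),
  turf     = rnorm(n_sites, mean = 30, sd = 8)
)

# Rank-based correlations avoid assuming normally distributed predictors
cor_mat <- cor(site_vars, method = "spearman")

# List variable pairs whose absolute correlation exceeds a chosen cutoff
cutoff <- 0.7  # arbitrary threshold for this sketch
high <- which(abs(cor_mat) > cutoff & upper.tri(cor_mat), arr.ind = TRUE)
high_pairs <- data.frame(
  var1 = rownames(cor_mat)[high[, "row"]],
  var2 = colnames(cor_mat)[high[, "col"]],
  rho  = cor_mat[high]
)

print(round(cor_mat, 2))
print(high_pairs)
```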

PCA to visualize variable relationships:

PCA for correlated (benthic only?) variables:

## Importance of components:
##                          PC1    PC2    PC3    PC4     PC5
## Standard deviation     1.754 1.1961 0.5865 0.2898 0.25726
## Proportion of Variance 0.615 0.2862 0.0688 0.0168 0.01324
## Cumulative Proportion  0.615 0.9012 0.9700 0.9868 1.00000
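
For reference, a minimal sketch of how a five-variable PCA like the one summarized above can be run with base R's `prcomp()`. The `benthic` table and its column names are hypothetical and the values are simulated; the real input variables are not listed in these notes.

```r
set.seed(3)

# Hypothetical benthic cover table for 13 sites (simulated, five variables)
n_sites <- 13
coral <- runif(n_sites, 5, 40)
benthic <- data.frame(
  coral      = coral,
  turf       = 60 - coral + rnorm(n_sites, sd = 5),
  macroalgae = runif(n_sites, 5, 30),
  cca        = 0.3 * coral + rnorm(n_sites, sd = 2),
  sponge     = runif(n_sites, 1, 10)
)

# Centre and scale so variables on different scales contribute comparably
pca <- prcomp(benthic, center = TRUE, scale. = TRUE)

# Prints an "Importance of components" table like the one above
print(summary(pca))

# Site scores on the first two axes, e.g. to carry forward as pc1 and pc2
site_scores <- as.data.frame(pca$x[, 1:2])
names(site_scores) <- c("pc1", "pc2")
```

With PC1 and PC2 together explaining about 90% of the variance in the output above, carrying only the first two axes forward as site-level predictors seems reasonable.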

2.2 Response variables

Fish-level grazing behaviors (as well as competitive interaction frequency)

2.2.1 Distributions

Variable selection notes:
- for_bites is correlated with fr and for_dur, but I will play around with keeping it for now.

3 Grazing boxplots

##             Df   Sum Sq Mean Sq F value   Pr(>F)    
## island       2 19329868 9664934   24.48 5.08e-09 ***
## Residuals   80 31586112  394826                     
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##   Tukey multiple comparisons of means
##     95% family-wise confidence level
## 
## Fit: aov(formula = fr ~ island, data = vet)
## 
## $island
##                      diff       lwr       upr     p adj
## Barbuda-Antigua -121.5556 -670.7081  427.5968 0.8575559
## Bonaire-Antigua  936.8111  485.8992 1387.7231 0.0000115
## Bonaire-Barbuda 1058.3667  630.3282 1486.4053 0.0000002

## Levene's Test for Homogeneity of Variance (center = median)
##       Df F value Pr(>F)
## group  2   0.731 0.4846
##       80
##             Df  Sum Sq Mean Sq F value   Pr(>F)    
## island       2 3522230 1761115    26.9 5.52e-10 ***
## Residuals   95 6218718   65460                     
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##   Tukey multiple comparisons of means
##     95% family-wise confidence level
## 
## Fit: aov(formula = fr ~ island, data = vir, white.adjust = T)
## 
## $island
##                      diff       lwr      upr     p adj
## Barbuda-Antigua 338.84411 180.14219 497.5460 0.0000055
## Bonaire-Antigua 408.89527 267.59360 550.1969 0.0000000
## Bonaire-Barbuda  70.05115 -94.41711 234.5194 0.5698132

## Levene's Test for Homogeneity of Variance (center = median)
##       Df F value Pr(>F)
## group  2  0.9125  0.405
##       95

##             Df Sum Sq Mean Sq F value  Pr(>F)    
## island       2  1.582  0.7911   22.98 1.3e-08 ***
## Residuals   80  2.753  0.0344                    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##   Tukey multiple comparisons of means
##     95% family-wise confidence level
## 
## Fit: aov(formula = g_frac ~ island, data = vet, white.adjust = T)
## 
## $island
##                        diff        lwr       upr     p adj
## Barbuda-Antigua -0.02121967 -0.1833515 0.1409122 0.9476109
## Bonaire-Antigua  0.27575852  0.1426312 0.4088858 0.0000122
## Bonaire-Barbuda  0.29697818  0.1706040 0.4233524 0.0000008

## Levene's Test for Homogeneity of Variance (center = median)
##       Df F value Pr(>F)
## group  2  0.1837 0.8325
##       80
##             Df Sum Sq Mean Sq F value  Pr(>F)    
## island       2  2.808  1.4040   26.13 9.1e-10 ***
## Residuals   95  5.105  0.0537                    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##   Tukey multiple comparisons of means
##     95% family-wise confidence level
## 
## Fit: aov(formula = g_frac ~ island, data = vir, white.adjust = T)
## 
## $island
##                       diff         lwr       upr     p adj
## Barbuda-Antigua 0.30233003  0.15853810 0.4461220 0.0000076
## Bonaire-Antigua 0.36517814  0.23715172 0.4932046 0.0000000
## Bonaire-Barbuda 0.06284811 -0.08616841 0.2118646 0.5760464

## Levene's Test for Homogeneity of Variance (center = median)
##       Df F value Pr(>F)
## group  2  0.7501 0.4751
##       95

##             Df Sum Sq Mean Sq F value Pr(>F)
## island       2  0.314  0.1571   1.448  0.241
## Residuals   80  8.676  0.1085
##   Tukey multiple comparisons of means
##     95% family-wise confidence level
## 
## Fit: aov(formula = br ~ island, data = vet)
## 
## $island
##                         diff         lwr       upr     p adj
## Barbuda-Antigua -0.008825416 -0.29663426 0.2789834 0.9970480
## Bonaire-Antigua  0.123227748 -0.11309360 0.3595491 0.4303229
## Bonaire-Barbuda  0.132053164 -0.09228034 0.3563867 0.3427971

## Levene's Test for Homogeneity of Variance (center = median)
##       Df F value    Pr(>F)    
## group  2  10.408 9.601e-05 ***
##       80                      
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##             Df Sum Sq Mean Sq F value Pr(>F)
## island       2  0.045 0.02253   0.502  0.607
## Residuals   95  4.267 0.04491
##   Tukey multiple comparisons of means
##     95% family-wise confidence level
## 
## Fit: aov(formula = br ~ island, data = vir, white.adjust = T)
## 
## $island
##                         diff        lwr        upr     p adj
## Barbuda-Antigua -0.002050384 -0.1335072 0.12940643 0.9992399
## Bonaire-Antigua -0.045754540 -0.1627983 0.07128921 0.6222646
## Bonaire-Barbuda -0.043704156 -0.1799374 0.09252906 0.7260125

## Levene's Test for Homogeneity of Variance (center = median)
##       Df F value    Pr(>F)    
## group  2   10.63 6.823e-05 ***
##       95                      
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

##             Df Sum Sq Mean Sq F value  Pr(>F)   
## island       2   1457   728.5   7.298 0.00126 **
## Residuals   76   7586    99.8                   
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 4 observations deleted due to missingness
##   Tukey multiple comparisons of means
##     95% family-wise confidence level
## 
## Fit: aov(formula = for_bites ~ island, data = vet)
## 
## $island
##                        diff       lwr       upr     p adj
## Barbuda-Antigua -0.07948718 -9.447089  9.288114 0.9997732
## Bonaire-Antigua  9.09956631  1.707813 16.491319 0.0118676
## Bonaire-Barbuda  9.17905349  1.787300 16.570807 0.0110379

## Levene's Test for Homogeneity of Variance (center = median)
##       Df F value Pr(>F)
## group  2  2.1906 0.1189
##       76
##             Df Sum Sq Mean Sq F value   Pr(>F)    
## island       2  903.1   451.5   13.19 9.77e-06 ***
## Residuals   88 3012.5    34.2                     
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 7 observations deleted due to missingness
##   Tukey multiple comparisons of means
##     95% family-wise confidence level
## 
## Fit: aov(formula = for_bites ~ island, data = vir, white.adjust = T)
## 
## $island
##                     diff         lwr       upr     p adj
## Barbuda-Antigua 3.599465 -0.14464682  7.343576 0.0621841
## Bonaire-Antigua 7.291484  3.90697129 10.675997 0.0000050
## Bonaire-Barbuda 3.692020 -0.09681671  7.480856 0.0578051

## Levene's Test for Homogeneity of Variance (center = median)
##       Df F value  Pr(>F)   
## group  2  5.0607 0.00831 **
##       88                   
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
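
Two things to check in the output above. First, several fits pass `white.adjust = T` to `aov()`; I believe that argument belongs to `car::Anova()` and is not used by `aov()`, so those fits are probably ordinary ANOVAs. Second, Levene's test is significant for br in both species and for for_bites in vir, so the equal-variance assumption is doubtful there. A minimal base-R sketch of the same workflow plus a Welch test as one possible alternative, on simulated data (the `dat` table and its values are hypothetical):

```r
set.seed(4)

# Hypothetical data with unequal spread among islands (values simulated)
dat <- data.frame(
  island = factor(rep(c("Antigua", "Barbuda", "Bonaire"), times = c(25, 25, 33))),
  br = c(rnorm(25, 1.0, 0.15), rnorm(25, 1.0, 0.20), rnorm(33, 1.1, 0.45))
)

# One-way ANOVA and Tukey pairwise comparisons
fit <- aov(br ~ island, data = dat)
print(summary(fit))
print(TukeyHSD(fit))

# Levene's test (center = median) by hand: ANOVA on absolute deviations
# from each group's median
dat$abs_dev <- abs(dat$br - ave(dat$br, dat$island, FUN = median))
levene <- anova(lm(abs_dev ~ island, data = dat))
print(levene)

# Welch's one-way test does not assume equal variances
welch <- oneway.test(br ~ island, data = dat, var.equal = FALSE)
print(welch)
```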

4 Exploratory bivariate plots

4.1 Grazing as a function of fish length

Note: remove G1 grazing instances here?

4.2 Grazing as a function of site traits

Notes:
- scarid biomass is not the best predictor once I account for differences among my samples in the sizes of fish sampled. I think the grazing/length relationships are much stronger.
- restricting the sample to length windows lowers sample size and makes trends much less pronounced, esp. for phase differences
- reducing sample to individual phase only also blurs trends

4.3 Competitive interactions

5 Model trials

  • site-level predictors: scar_bm, scar_den, carn_bm, benthic (pc1, pc2)
  • fish-level predictors: species, phase, length
  • eventually run separately for different species
  • species*scar_bm interaction?
  • random effects: island
  • response variables: g_frac, br (?), and fr (run separately)

5.1 Mixed effects models

  • random effect: island or site?
## Linear mixed-effects model fit by REML
##  Data: filter(sum_id_pca1, species_code != "rbp") 
##        AIC      BIC    logLik
##   9300.792 9340.572 -4641.396
## 
## Random effects:
##  Formula: ~1 | site
##         (Intercept) Residual
## StdDev:    162.6097 436.1899
## 
## Fixed effects: fr ~ phase + length_cm + species + scar_bm + pc1 + pc2 
##                             Value Std.Error  DF    t-value p-value
## (Intercept)             1314.6465 135.44162 605   9.706370  0.0000
## phaset                  -157.8237  49.97291 605  -3.158184  0.0017
## length_cm                -10.1823   3.50964 605  -2.901238  0.0039
## speciesSparisoma viride -621.9076  36.49852 605 -17.039257  0.0000
## scar_bm                   -0.0187   0.05582   9  -0.335883  0.7447
## pc1                     -154.4773  46.44156   9  -3.326273  0.0089
## pc2                        0.4426  44.47607   9   0.009952  0.9923
##  Correlation: 
##                         (Intr) phaset lngth_ spcsSv scr_bm pc1   
## phaset                   0.325                                   
## length_cm               -0.560 -0.657                            
## speciesSparisoma viride -0.202 -0.057  0.068                     
## scar_bm                 -0.677  0.021 -0.073 -0.001              
## pc1                     -0.546  0.007  0.002 -0.042  0.782       
## pc2                     -0.090 -0.059  0.068 -0.009  0.072  0.047
## 
## Standardized Within-Group Residuals:
##         Min          Q1         Med          Q3         Max 
## -2.83450026 -0.56865286 -0.01818742  0.54135788  5.22633799 
## 
## Number of Observations: 621
## Number of Groups: 13
##     phase length_cm   species   scar_bm       pc1       pc2 
##  1.771002  1.803343  1.010440  2.644180  2.621751  1.011652
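
A minimal sketch of the model structure above using `nlme::lme()` with a random intercept for site, on simulated data (the `fish` table, its values, and the reduced predictor set are assumptions for illustration). It also shows one way to get variance inflation factors from the fixed-effect covariance matrix, which I believe matches what `car::vif()` reports for single-coefficient terms.

```r
library(nlme)  # recommended package, included with standard R installations

set.seed(5)

# Hypothetical data: 13 sites, 20 fish each; site-level predictors constant within site
n_sites <- 13
n_fish  <- 20
site_tab <- data.frame(
  site    = factor(paste0("s", seq_len(n_sites))),
  scar_bm = rnorm(n_sites, 1500, 400),
  pc1     = rnorm(n_sites)
)
site_eff <- rnorm(n_sites, sd = 150)  # simulated random intercepts
fish <- site_tab[rep(seq_len(n_sites), each = n_fish), ]
fish$length_cm <- rnorm(nrow(fish), 22, 5)
fish$fr <- 1300 - 10 * fish$length_cm - 150 * fish$pc1 +
  rep(site_eff, each = n_fish) + rnorm(nrow(fish), sd = 400)

# Random intercept for site; REML is the lme() default
fit <- lme(fr ~ length_cm + scar_bm + pc1, random = ~ 1 | site, data = fish)
tt <- summary(fit)$tTable
print(tt)

# Variance inflation factors from the correlation matrix of the
# fixed-effect estimates (intercept dropped); valid for 1-df terms
R <- cov2cor(vcov(fit))[-1, -1]
vif <- diag(solve(R))
print(round(vif, 2))
```

Note that site-level predictors are tested against the number of sites, not the number of fish: in the output above scar_bm, pc1, and pc2 have 9 denominator DF, while the fish-level terms have 605.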

5.2 GAMM

  • for GAMMs - seems model results are the same whether random = site or island?

To Do as of Nov. 7
* boosted regression trees (Ecosphere 2017, Adrian's paper)
* species as random effect